Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New result entries table #1092

Merged
22 commits merged into from Nov 21, 2023
Merged

New result entries table #1092

22 commits merged into from Nov 21, 2023

Conversation

ghost
Copy link

@ghost ghost commented Mar 16, 2023

Purpose

Move the test_results.results json array into a dedicated table result_entries.

Context

Replaces #856

Changes

  • Add 2 new tables result_entries and log_level
  • Add foreign key constraints on result_entries(hash_id) -> test_results(hash_id) and result_entries(level) -> log_level(level)
  • The get_test_result and get_test_history methods are modified to use the new table
  • DB::test_result is now only a getter, all write operations use the new database methods add_result_entry and add_result_entries
  • Provide a migration script.

How to test this PR

  • Create a few tests
  • Get the history / results
    It should work the same way as it was

Credits to @blacksponge for most of the work.

@ghost ghost added the V-Minor Versioning: The change gives an update of minor in version. label Mar 16, 2023
@ghost ghost added this to the v2023.1 milestone Mar 16, 2023
@ghost ghost requested review from matsduf, hannaeko, tgreenx and marc-vanderwal March 16, 2023 09:09
@ghost ghost mentioned this pull request Mar 16, 2023
@matsduf
Copy link
Contributor

matsduf commented Mar 16, 2023

Isn't i better to store the hash_id as number?

@ghost
Copy link
Author

ghost commented Mar 16, 2023

Isn't i better to store the hash_id as number?

TBH I've never thought about that. This uses the same datatype as in the test_results table. Maybe we could consider updating the hash_id datatype.

Copy link
Contributor

@matsduf matsduf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the change decrease or increase the load of the database when running a batch?

lib/Zonemaster/Backend/Config.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/DB.pm Outdated Show resolved Hide resolved
@matsduf matsduf modified the milestones: v2023.1, v2023.2 May 2, 2023
Copy link
Contributor

@marc-vanderwal marc-vanderwal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the only comment I have, it looks good to me.

marc-vanderwal
marc-vanderwal previously approved these changes Jul 12, 2023
Copy link
Contributor

@marc-vanderwal marc-vanderwal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@ghost ghost requested review from matsduf and hannaeko July 12, 2023 14:30
Copy link
Contributor

@matsduf matsduf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we implement this change we should understand the gains and losses of the change. At least Mysql and Postgresql have support for querying into the JSON blob as if it consisted of several fields.

lib/Zonemaster/Backend/DB.pm Outdated Show resolved Hide resolved
Copy link
Author

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we implement this change we should understand the gains and losses of the change. At least Mysql and Postgresql have support for querying into the JSON blob as if it consisted of several fields.

Pro: you could use SQL syntax to query in the table (no need for specific JSON support).
Con: you have to JOIN the tables.

If you see other gain/loss, please share them.

lib/Zonemaster/Backend/DB.pm Outdated Show resolved Hide resolved
@matsduf
Copy link
Contributor

matsduf commented Jul 13, 2023

Both Mysql (Mariadb) and Postgresql have good support for JSON so how much gain is it? Are there any performance differences in saving the result in a table instead of in JSON? Are there any performance differences when reading the result?

@matsduf
Copy link
Contributor

matsduf commented Jul 13, 2023

@PNAX, have estimated how time it will take to migrade the database in the zonemster.net server?

@ghost
Copy link
Author

ghost commented Jul 13, 2023

have estimated how time it will take to migrade the database in the zonemster.net server?

Yes but I can't find the figure again. If I recall correctly, this was around 2 to 3 hours.

@matsduf
Copy link
Contributor

matsduf commented Jul 13, 2023

Are there any performance differences when reading the result?

@matsstralbergiis will try to measure the difference.

@PNAX, have you compared load of writing with old and new code?

@ghost
Copy link
Author

ghost commented Jul 13, 2023

Both Mysql (Mariadb) and Postgresql have good support for JSON so how much gain is it?

See #856 (comment) and #1092 (review). Another gain would be to ease the use of another database engine (for instance Clickhouse, see #1094).

Regarding the performance gain, I need to redo some tests. (I thought I had done some a few months back but I can't find my results anymore.)

@matsduf
Copy link
Contributor

matsduf commented Jul 13, 2023

See #856 (comment) and #1092 (review). Another gain would be to ease the use of another database engine (for instance Clickhouse, see #1094).

The comment in #856 was mine, but today I am better informed and understand that it actually works to let the database engine look into JSON. But it still might be a good idea to use a separate table. It is conceptually simpler. And possibly better support for other engines.

Regarding the performance gain, I need to redo some tests. (I thought I had done some a few months back but I can't find my results anymore.)

Thanks!

@result_entries = map {
{
%$_,
args => decode_json( $_->{args} ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if performance is an issue for test_results, but if it is I have a hunch we're spending a lot of cycles on this decoding and re-encoding of all the JSON-blobs.

  1. Do you know if it would be possible do some hack with string concatenation? Even though it's ugly, if it's possible it might be worth it.
  2. Also, do you know if decoding lots of small blobs is faster or slower or the same as decoding one big blob?

lib/Zonemaster/Backend/DB/PostgreSQL.pm Outdated Show resolved Hide resolved
share/patch/patch_db_zonemaster_backend_ver_11.0.0.pl Outdated Show resolved Hide resolved
share/patch/patch_db_zonemaster_backend_ver_11.0.0.pl Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/DB.pm Outdated Show resolved Hide resolved
@matsstralbergiis
Copy link
Collaborator

matsstralbergiis commented Jul 25, 2023

I have copied the 'test_result' table to a MySQL instance running in my Mac. The database contains 208 149 tests.

Then I created the new 'result_entries' table (I copied the create command from the MySQL.pm in this PR.) and extracted the data from the JSON blob in the 'test_result' table into the nwe table. It now contains 14 518 570 records.

After that I performed two queries that I believe gets the same result. (batch 10 contains 81689 tests)

One to the 'old' format:

SELECT tr.batch_id, tr.hash_id, tr.domain, result.tag, count(result.tag), result.level, result.module, result.testcase FROM    test_results tr,   JSON_TABLE(     CAST(CONVERT(results USING utf8) AS JSON),     "$[*]" COLUMNS(        tag VARCHAR(40) PATH '$.tag' ERROR ON ERROR,       level VARCHAR(10) PATH '$.level' ERROR ON ERROR,       module VARCHAR(20) PATH '$.module' ERROR ON ERROR,       testcase VARCHAR(20) PATH '$.testcase' ERROR ON ERROR     )   ) as result WHERE batch_id = 10 GROUP BY tr.hash_id, tr.domain, result.tag, result.level, result.module, result.testcase ORDER BY tr.hash_id limit 10;
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
| batch_id | hash_id          | domain               | tag                        | count(result.tag) | level | module       | testcase       |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ADDRESSES_MATCH            |                 1 | INFO  | CONSISTENCY  | CONSISTENCY05  |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ARE_AUTHORITATIVE          |                 1 | INFO  | DELEGATION   | DELEGATION04   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | AXFR_FAILURE               |                 8 | INFO  | NAMESERVER   | NAMESERVER03   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_CHILD_FOUND            |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_PARENT_FOUND           |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B02_AUTH_RESPONSE_SOA      |                 1 | INFO  | BASIC        | BASIC02        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CAN_BE_RESOLVED            |                 1 | INFO  | NAMESERVER   | NAMESERVER06   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CASE_QUERIES_RESULTS_OK    |                 1 | INFO  | NAMESERVER   | NAMESERVER09   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CHILD_DISTINCT_NS_IP       |                 1 | INFO  | DELEGATION   | DELEGATION02   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CN04_IPV4_DIFFERENT_PREFIX |                 1 | INFO  | CONNECTIVITY | CONNECTIVITY04 |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
10 rows in set, 1 warning (52.8186 sec)

And one to the new format:

SELECT tr.batch_id, tr.hash_id, tr.domain, result.tag, count(result.tag), result.level, result.module, result.testcase FROM result_entries result LEFT JOIN test_results tr ON result.hash_id = tr.hash_id WHERE batch_id = 10 GROUP BY tr.hash_id, tr.domain, result.tag, result.level, result.module, result.testcase ORDER BY tr.hash_id limit 10;
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
| batch_id | hash_id          | domain               | tag                        | count(result.tag) | level | module       | testcase       |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ADDRESSES_MATCH            |                 1 | INFO  | CONSISTENCY  | CONSISTENCY05  |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | ARE_AUTHORITATIVE          |                 1 | INFO  | DELEGATION   | DELEGATION04   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | AXFR_FAILURE               |                 8 | INFO  | NAMESERVER   | NAMESERVER03   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_CHILD_FOUND            |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B01_PARENT_FOUND           |                 1 | INFO  | BASIC        | BASIC01        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | B02_AUTH_RESPONSE_SOA      |                 1 | INFO  | BASIC        | BASIC02        |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CAN_BE_RESOLVED            |                 1 | INFO  | NAMESERVER   | NAMESERVER06   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CASE_QUERIES_RESULTS_OK    |                 1 | INFO  | NAMESERVER   | NAMESERVER09   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CHILD_DISTINCT_NS_IP       |                 1 | INFO  | DELEGATION   | DELEGATION02   |
|       10 | 000081efd2d471e0 | xn--nsbyslott-v2a.se | CN04_IPV4_DIFFERENT_PREFIX |                 1 | INFO  | CONNECTIVITY | CONNECTIVITY04 |
+----------+------------------+----------------------+----------------------------+-------------------+-------+--------------+----------------+
10 rows in set (25.3378 sec)

The results are consistent when repeating the queries.

Maybe it is possible to optimize the queries to speed it up. Feel free to suggest improvements.

@matsduf matsduf dismissed their stale review July 25, 2023 13:49

At least, it does not seems like we lose any performance. The new format seems to take half the time for database lookups.

@matsduf
Copy link
Contributor

matsduf commented Jul 26, 2023

If databases are better at handling integer keys, wouldn't it be better to convert the hash_id into an integer instead?

@ghost
Copy link
Author

ghost commented Nov 2, 2023

I've rebased on top of develop to integrate the changes from #1121 and address a comment from @mattias-p. I think I should have then addressed all comments/remarks. Please re-review.

Copy link
Member

@hannaeko hannaeko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work :) Just a few cleanup remarks.

lib/Zonemaster/Backend/TestAgent.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/TestAgent.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/TestAgent.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/TestAgent.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/TestAgent.pm Outdated Show resolved Hide resolved
lib/Zonemaster/Backend/DB.pm Outdated Show resolved Hide resolved
* remove unnecessary comments
* rename a variable
* rename a metric
Copy link
Member

@hannaeko hannaeko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed an issue in the migration script, beside it I tested creating a test and querying for the history without issue.

* output the total number of updated rows at the end (SQLite and MySQL)
* store level value
* only update job with results (SQLite and MySQL)
Copy link
Member

@mattias-p mattias-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good to me! I only have a few remarks.

push @records, $r;
}
my $query_values = join ", ", ("(?, ?, ?, ?, ?, ?, ?)") x @records;
my $query = "INSERT INTO result_entries (hash_id, level, module, testcase, tag, timestamp, args) VALUES $query_values";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be super-nice if we could verify that the job is "running" when performing the insert and throw an error otherwise. I'm open to leaving this as a future exercise though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Just a question on how to do this. I was thinking to add something like:

    if ( $current_state ne $TEST_RUNNING ) {
        die Zonemaster::Backend::Error::Internal->new( reason => 'cannot store results to a non-running test' );
    }

but I think it won't be sufficient. What if the test's state is updated by another process between the check and the insert?

Copy link
Member

@mattias-p mattias-p Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly. I made a proof of concept for how to make it atomic. But I'm not sure about its scalability. Maybe it's great, I don't know. And I also haven't tried it in the other DB engines.

#!/usr/bin/env perl
use strict;
use warnings;
use 5.014;
use utf8;

use DBI;

my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "" );

sub setup {
    $dbh->do( '
        CREATE TABLE test_results (
            id       INTEGER,
            progress INTEGER
        )
    ' );
    $dbh->do( '
        CREATE TABLE result_entries (
            test_id INTEGER,
            text    VARCHAR(20)
        )
    ' );
    $dbh->do( 'INSERT INTO test_results VALUES (1234, 0)' );
    $dbh->do( 'INSERT INTO test_results VALUES (2345, 50)' );
    $dbh->do( 'INSERT INTO test_results VALUES (3456, 100)' );
}

sub insert {
    my ( $id, @messages ) = @_;

    my $prefix    = 'INSERT INTO result_entries ';
    my $separator = ' UNION ';
    my $select    = 'SELECT id, ? FROM test_results WHERE id = ? AND progress > 0 AND progress < 100';
    my $query     = $prefix . join $separator, ( $select ) x scalar @messages;
    my $sth       = $dbh->prepare( $query );

    my @params;
    for my $message ( @messages ) {
        push @params, $message, $id;
    }

    my $rv = $sth->execute( @params );
    if ( $rv && $rv == 0 ) {
        die "illegal state transition";
    }
}

sub show {
    my ( $table ) = @_;
    my $rows = $dbh->selectall_arrayref( "SELECT * FROM $table" );
    say "Dump of $table:";
    for my $row ( @$rows ) {
        say join ', ', @{$row};
    }
}

my $id       = shift @ARGV;
my @messages = @ARGV;

setup();
show( 'test_results' );

say 'Inserting...';
insert( $id, @messages );

show( 'result_entries' );
$ ./poc 1234 aaa bbb ccc                                                                  
Dump of test_results:
1234, 0
2345, 50
3456, 100
Inserting...
illegal state transition at ./poc line 45.
$ ./poc 2345 aaa bbb ccc                                                                  
Dump of test_results:
1234, 0
2345, 50
3456, 100
Inserting...
Dump of result_entries:
2345, aaa
2345, bbb
2345, ccc

Copy link
Author

@ghost ghost Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proof of concept. I think I'll go with your suggestion to "leav[e] this as a future exercise". I've open an issue for that, see #1135.

lib/Zonemaster/Backend/DB/MySQL.pm Show resolved Hide resolved
lib/Zonemaster/Backend/DB/PostgreSQL.pm Show resolved Hide resolved
lib/Zonemaster/Backend/DB/SQLite.pm Show resolved Hide resolved
Copy link
Contributor

@matsduf matsduf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me.

@ghost ghost merged commit dfd197d into zonemaster:develop Nov 21, 2023
1 check passed
@mattias-p
Copy link
Member

v2023.2 Release Testing

The "How to test this PR" section is not very explicit and doesn't concern itself with the database migration at all, so I made up the following procedure.

  1. Create dist tarballs for Zonemaster-LDNS, Zonemaster-Engine and Zonemaster-Backend from both the master branch and the develop branch and place them in the $HOME/old and $HOME/new directories.
  2. Using the master branch dist tarballs, follow the installation instructions for Engine and Backend up to but excluding the "Database engine installation" section.
  3. Set age_reuse_previous_test = 60 for good measure.
  4. Take a snapshot of the machine and name it "old-before-db" (to be restored in later steps).
  5. For each DB engine as $db:
    1. Complete Backend installation using $db.
    2. Run zmtest nu > $HOME/$db-nu.old.result and save the test id as $test_id_nu.
    3. Run zmtest SE. > $HOME/$db-se.old.result and save the test id as $test_id_se.
    4. Take a snapshot of the machine and name it "old-$db" (in case we need get back here).
    5. Run sudo systemctl stop zm-rpcapi zm-testagent.
    6. Run sudo cpanm --notest $HOME/new/*.
    7. Run cd $(perl -MFile::ShareDir=dist_dir -E 'say dist_dir("Zonemaster-Backend")').
    8. Run sudo perl patch/patch_db_zonemaster_backend_ver_11.0.3.pl`.
    9. Run sudo systemctl start zm-rpcapi zm-testagent.
    10. Run zmb get_test_results --lang en --test-id $test_id_nu | jq -S > $HOME/$db-nu.migrated.result.
    11. Run zmb get_test_results --lang en --test-id $test_id_se | jq -S > $HOME/$db-se.migrated.result.
    12. Run zmtest nu > $HOME/$db-nu.new.result.
    13. Run zmtest SE. > $HOME/$db-se.new.result.
    14. Verify that the $HOME/$db-nu.* files are essentially identical.
    15. Verify that the $HOME/$db-se.* files are essentially identical.
    16. Take a snapshot of the machine and name it "new-$db" (in case we need get back here).
    17. Restore the "old-before-db" snapshot.

When following this procedure I found a couple of problems.

  1. For SQLite < 3.32.0 the current version of the schema migration script (develop) is only able to migrate tests with 142 messages or fewer. This is an unreasonably small limit. For SQLite > 3.32.0 the limit is 4680. On Rocky Linux 8.9 we're using SQLite 3.26.0.

    I'm not sure if 4680 is a suffient limit. The limit could be set dynamically to allow any number of messages, that isn't high for SQLite itself, of course. You can set the limit something like this: $dbh->sqlite_limit( DBD::SQLite::Constants::SQLITE_LIMIT_SQL_LENGTH, $new_value )

    This is bad but I don't think it's a blocker. People who want to keep their test results are probably using MariaDB or PostgreSQL anyway.

    We ought to create a tool for migrating databases from one DB engine to another within the same version of Backend.

  2. If you upgrade the database but forget to restart testagent, you get tests with empty results.

    This is bad and confusing but I don't think it's a blocker.

  3. After migrating a database from the previous schema version to the new one, get_test_results no longer reports properly translated message strings. Instead the message strings are returned in "raw" format.

    This is because the schema migration script and the RPCAPI daemon disagree on the format of the result_entries.module and result_entries.testcase colums. The daemon reads them case sensitively and expects most of their values to be capitalized. The script writes them in upper case and seems to assume they are case insensitive.

    Even though reports of migrated test results are more cryptic to the human reader there is no data loss and proper order can be restored using an additional migration script in a fix release. Or we could treat this as a blocker and fix the migration script before releasing.

mattias-p added a commit to mattias-p/zonemaster-backend that referenced this pull request Jan 18, 2024
For some reason get_test_results used to add 'ns' arguments to entries
from the Nameserver module. These are interpolated into the message
anyway, and no other entries are returned with raw arguments. In zonemaster#1092
we accidentally broke this behavior, but since the behavior was
undocument and unintended, it's better to clean it up than to fix it.
@mattias-p mattias-p mentioned this pull request Jan 18, 2024
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
V-Minor Versioning: The change gives an update of minor in version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants